Pertinent Prosodic Features for Speaker Identification by Voice
Abstract
Most existing speaker recognition systems use “state of the art” acoustic features. However, a speaker can often be recognized only by his or her prosodic features, especially the accent. For this reason, the authors investigate pertinent prosodic features that can be combined with classic acoustic features in order to improve recognition accuracy. The authors have developed a new prosodic model using a modified LVQ (Learning Vector Quantization) algorithm, called MLVQ (Modified LVQ). This model is composed of three reduced prosodic features: the mean of the pitch, original duration, and low-frequency energy. Since these features are heterogeneous, a new optimized metric is proposed, called Optimized Distance for Heterogeneous Features (ODHEF). Speaker identification tests are conducted on an Arabic corpus, because the NIST evaluations showed that speaker verification scores depend on the spoken language and that some of the worst scores were obtained for Arabic. Experimental results show good performance for the new prosodic approach.
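As a rough illustration of the kind of pipeline the abstract describes, the sketch below trains a plain LVQ1 classifier on three-dimensional prosodic vectors (mean pitch, duration, low-frequency energy) and compares vectors with a per-feature scaled Euclidean distance. It is not the authors' MLVQ or ODHEF, whose details the abstract does not give. The scaling by per-feature standard deviation, the function names, the learning rate, and the toy data are all assumptions made for illustration.

```python
import math
import random


def feature_scales(samples):
    """Per-feature standard deviation, used to put heterogeneous
    features on a comparable scale (an assumption, not ODHEF)."""
    n = len(samples)
    scales = []
    for column in zip(*samples):
        mean = sum(column) / n
        var = sum((v - mean) ** 2 for v in column) / n
        scales.append(math.sqrt(var) or 1.0)  # avoid division by zero
    return scales


def scaled_distance(x, y, scales):
    """Euclidean distance after dividing each feature by its scale."""
    return math.sqrt(sum(((a - b) / s) ** 2 for a, b, s in zip(x, y, scales)))


def train_lvq1(samples, labels, scales, epochs=30, lr=0.2, seed=0):
    """Standard LVQ1 with one prototype per speaker (hypothetical setup)."""
    rng = random.Random(seed)
    prototypes = {}
    for lab in sorted(set(labels)):
        members = [s for s, l in zip(samples, labels) if l == lab]
        # Initialise each prototype at its class mean.
        prototypes[lab] = [sum(col) / len(members) for col in zip(*members)]
    order = list(range(len(samples)))
    for epoch in range(epochs):
        rate = lr * (1 - epoch / epochs)  # linearly decaying learning rate
        rng.shuffle(order)
        for i in order:
            x, y = samples[i], labels[i]
            winner = min(
                prototypes,
                key=lambda k: scaled_distance(x, prototypes[k], scales),
            )
            # Attract the winning prototype if its label is correct,
            # repel it otherwise.
            sign = 1.0 if winner == y else -1.0
            prototypes[winner] = [
                p + sign * rate * (a - p) for p, a in zip(prototypes[winner], x)
            ]
    return prototypes


def identify(x, prototypes, scales):
    """Return the speaker label of the nearest prototype."""
    return min(prototypes, key=lambda k: scaled_distance(x, prototypes[k], scales))


# Toy vectors: [mean pitch (Hz), duration (s), low-frequency energy (dB)].
# These values are invented for the example.
train_x = [
    [118.0, 0.21, 62.0], [122.0, 0.23, 60.5], [120.5, 0.22, 61.2],
    [205.0, 0.15, 55.0], [210.0, 0.16, 54.1], [207.5, 0.14, 56.0],
]
train_y = ["spk_A"] * 3 + ["spk_B"] * 3

scales = feature_scales(train_x)
prototypes = train_lvq1(train_x, train_y, scales)
predicted = identify([119.0, 0.22, 61.0], prototypes, scales)
print(predicted)
```

Without some form of scaling, the pitch values (hundreds of Hz) would dominate the duration values (fractions of a second) in a raw Euclidean distance, which is the general problem a metric for heterogeneous features has to address.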
Related articles
How vulnerable are prosodic features to professional imitators?
Voice imitation is one of the potential threats to security systems that use automatic speaker recognition. Since prosodic features have been considered for state-of-the-art recognition systems in recent years, the question arises as to how vulnerable these features are to voice mimicking. In this study, two experiments are conducted for twelve individual features in order to determine how a pr...
Robustness of prosodic features to voice imitation
Prosody plays an important role in the human recognition process; therefore, prosodic elements are normally used by impersonators aiming to resemble someone else. Since such voice imitation is one of the potential threats to security systems relying on automatic speaker recognition, and prosodic features have been considered for state-of-the-art recognition systems in recent years, the question...
Using voice-quality measurements with prosodic and spectral features for speaker diarization
Jitter and shimmer voice-quality measurements have been successfully used to detect voice pathologies and classify different speaking styles. In this paper, we investigate the usefulness of jitter and shimmer voice measurements in the framework of the speaker diarization task. The combination of jitter and shimmer voice-quality features with the long-term prosodic and short-term spectral feature...
The effect of bilateral subthalamic nucleus deep brain stimulation (STN-DBS) on the acoustic and prosodic features in patients with Parkinson’s disease: A study protocol for the first trial on Iranian patients
Background: The effect of subthalamic nucleus deep brain stimulation (STN-DBS) on voice features in Parkinson’s disease (PD) is controversial. No study has comprehensively evaluated the voice features of PD patients who underwent STN-DBS using acoustic, perceptual, and patient-based assessments. Furthermore, no study has investigated prosodic features before and after DBS in PD. The curren...
Automatic speaker recognition as a measurement of voice imitation and conversion
Voices can be deliberately disguised by means of human imitation or voice conversion. The question arises to what extent they can be modified by using either method. In the current paper, a set of speaker identification experiments is conducted; first, analysing some prosodic features extracted from voices of professional impersonators attempting to mimic a target voice and, second, using both...
Journal: IJMCMC
Volume: 2
Publication date: 2010